Longitudinal study of ASR performance on ageing voices

Authors

Ravichander Vipperla

Steve Renals

Joe Frankel

Abstract

This paper presents the results of a longitudinal study of automatic speech recognition (ASR) performance on ageing voices. Experiments were conducted on audio recordings of the proceedings of the Supreme Court of the United States (SCOTUS). Results show that ASR word error rates (WERs) for elderly voices are significantly higher than those for adult voices, and that the WER increases gradually as the age of the elderly speakers increases. Speaker adaptation based on maximum likelihood linear regression (MLLR) improves the WER on ageing voices, though it remains considerably higher than for adult voices. Speaker adaptation does, however, reduce the rate at which WER increases with age in old age.
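For readers unfamiliar with the metric, the word error rate reported above is conventionally the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the recogniser output, divided by the number of reference words. The sketch below illustrates that standard definition only. The abstract does not describe the scoring tool or text normalisation the authors used, so the function name `word_error_rate` and the whitespace tokenisation are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; tokenisation is plain
    whitespace splitting, which is an assumption, not the paper's setup.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Under this definition, a hypothesis that drops one word from a six-word reference scores 1/6, and an "absolute" WER difference between two speaker groups is simply the difference between their two rates.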


Similar resources

Ageing Voices: The Effect of Changes in Voice Parameters on ASR Performance

With ageing, human voices undergo several changes, which are typically characterized by increased hoarseness and changes in articulation patterns. In this study, we have examined the effect on Automatic Speech Recognition (ASR) and found that the Word Error Rates (WERs) on older voices are about 9% absolute higher than those on adult voices. Subsequently, we compared several voice source pa...

Impact of Age in ASR for the Elderly: Preliminary Experiments in European Portuguese

Standard automatic speech recognition (ASR) systems use acoustic models typically trained with speech of young adult speakers. Ageing is known to alter speech production in ways that require ASR systems to be adapted, in particular at the level of acoustic modeling. This paper reports ASR experiments that illustrate the impact of speaker age on speech recognition performance. A large read speec...

Template-based ASR using posterior features and synthetic references: comparing different TTS systems

In recent works, the use of phone class-conditional posterior probabilities (posterior features) directly as features has provided successful results in template-based ASR systems. In this paper, motivated by the high quality of current text-to-speech systems and the robustness of posterior features toward undesired variability, we investigate the use of synthetic speech to generate reference t...

Synthetic References for Template-based ASR using posterior features

Recently, the use of phoneme class-conditional probabilities as features (posterior features) for template-based ASR has been proposed. These features have been found to generalize well to unseen data and yield better systems than standard spectral-based features. In this paper, motivated by the high quality of current text-to-speech systems and the robustness of posterior features toward undesi...

Development of New Telephone Speech Databases for French: the NEOLOGOS Project

The NEOLOGOS project is a speech database creation project for the French language, resulting from a collaboration between French universities and industrial companies, and supported by the French Ministry for Research. The goal of NEOLOGOS is to create new kinds of speech databases: firstly, a 1000-speaker telephone database of children’s voices, called PAIDIALOGOS, following the SpeechDat g...


Publication date: 2008
